Base jdbc MERGE support #16944
Conversation
I was thinking of merging #16445 first. Is that OK with you?
Sure!

@kokosing, @chenjian2664 Thanks!

@chenjian2664 Would you like to rebase this PR? I know that @vlad-lyutenko is still going to improve UPDATE a bit. There are some rough edges that we would like to smooth out.
I remember it's because the performance is too bad on SQL Server.
Sure.
Are you suggesting that we keep this PR suspended?
It has already been suspended for too long. It is an independent effort.
Support update/merge for Postgresql
Support update/merge for Ignite
Support update/merge for Oracle
Refactor Phoenix merge implementation by reuse base jdbc merge implementation.

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

👋 @kokosing @chenjian2664 @sajjoseph can you collaborate to figure out best next steps here?

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.
Description
#16709
Additional context and related issues
This PR adds MERGE support for JDBC-based connectors, following #16693. The major changes are:
JdbcClient & BaseJdbcClient:
- Add a `getPrimaryKeys` method that returns the primary keys of the merge target table. It must not return an empty result if the connector declares MERGE support. The default implementation is in `BaseJdbcClient`; the function `extractJdbcHandlesFromResultSet` was extracted from `getColumns` so that it can be reused. Currently the primary keys are used to perform the delete and update parts of the merge.
- Use the result of `getPrimaryKeys` to check whether the connector supports merge.
- Add methods `beginDeleteTableForMerge` and `finishDeleteTableForMerge` to support fault-tolerant execution (FTE) for merge. This logic is used if the connector supports retries and uses transactional insert. The default implementation in `BaseJdbcClient`:
  - `beginDeleteTableForMerge` uses the primary keys to build a temporary table.
  - `finishDeleteTableForMerge` performs the actual delete with `DELETE FROM merge_target WHERE EXISTS (SELECT 1 FROM (temp_table_data) temp WHERE "getConjunctsBetweenTargetAndTemporaryTable(merge_target, temp)")`. Here `temp_table_data` is something like `SELECT * FROM temp_table WHERE EXISTS (SELECT 1 FROM page_sink_table WHERE page_sink_table.id_column = temp_table.id_column)`. The logic is the same as the original `INSERT` operation that uses a temporary table (`INSERT INTO insert_target SELECT * FROM temp_table WHERE EXISTS (SELECT 1 FROM page_sink_table WHERE page_sink_table.id_column = temp_table.id_column)`); `temp_table` is built in `beginDeleteTableForMerge`.
- Some connectors return special primary keys that cannot be used directly to build the temporary table. For example, PostgreSQL returns `ctid` as the primary key, which is a hidden column present in every table. When building the temporary table that stores the key values, the special column has to be renamed to avoid conflicts: for PostgreSQL the temporary table uses the column name `ctid_for_delete_merge` to store the `ctid` values, and `getConjunctsBetweenTargetAndTemporaryTable` then uses the correct column names to build the condition between the target and the temporary table.
- Add a `buildMergeRowIdConjuncts` method for connectors that do not support (or disable) building a temporary table to perform the delete. It builds the `WHERE` condition of `DELETE FROM merge_target WHERE...` from the primary keys (called in `JdbcMergeSink`).
- The `updateScanColumnsForMerge` method works the same way as in the Phoenix merge support: it updates the scan to include all the primary keys if possible.

DefaultJdbcMetadata:
- A `deleteForMergeRollbackAction` field, used to roll back the delete operation if the connector supports deleting through a temporary table.
- `getMergeRowIdColumnHandle` checks whether the connector supports merge. If it does, it returns a column handle composed of the primary keys (as returned by `getPrimaryKeys` in the client).
- `beginMerge` passes along all the information needed by the merge. It is similar to what the Phoenix connector does; the difference is an added `beginDeleteMerge` step for connectors that use a temporary table to perform the delete.
- `finishMerge` calls `finishInsert` and `finishDeleteForMerge`.

JdbcMergeSink:
It is moved from the Phoenix implementation. The differences are:
- Updates are issued as `UPDATE merge_target SET reserved_columns WHERE...`, where `reserved_columns` are the columns that are not primary keys. Their values come from the channels provided by `getReservedChannelsForUpdate`, and the `WHERE` condition is built by `jdbcClient.buildMergeRowIdConjuncts`.

The connectors:
- Implementations of `getPrimaryKeys` return one of:
  - `ctid`
  - `ROWID`
  - the rowkey defined in the table
- Other JDBC connectors, such as MySQL, can actually perform the merge as long as the table defines a primary key. Since the connector does not enforce this, `BaseJdbcClient.supportsMerge` currently returns false by default, and those connectors do not declare merge support.

Tests:
- PostgreSQL and Oracle support FTE, so `supportMerge` is defined in `BaseJdbcFailureRecoveryTest` to indicate the behavior, and the test cases are updated through that method in `BaseJdbcFailureRecoveryTest`.
- Many merge tests in Oracle are adjusted because of name length constraints.
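
To make the two delete paths and the update statement described above easier to picture, here is a minimal, self-contained sketch of the SQL strings involved. It is not the code in this PR: the class `MergeSqlSketch`, its method names and signatures, and the identifier quoting are all assumptions made for illustration, and only the statement shapes and the `ctid` / `ctid_for_delete_merge` rename come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names are taken from the PR's code.
public class MergeSqlSketch
{
    // Quote an identifier ANSI-style, doubling any embedded double quotes.
    static String quoted(String name)
    {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    // Direct path (no temporary table): one "key = ?" conjunct per primary key.
    static String rowIdConjuncts(List<String> primaryKeys)
    {
        return primaryKeys.stream()
                .map(key -> quoted(key) + " = ?")
                .collect(Collectors.joining(" AND "));
    }

    // Temporary-table path: equate each target key with its (possibly renamed) temp column.
    static String conjunctsBetweenTargetAndTemp(String target, String alias, List<String> targetKeys, List<String> tempKeys)
    {
        if (targetKeys.size() != tempKeys.size()) {
            throw new IllegalArgumentException("key lists must have the same length");
        }
        List<String> conjuncts = new ArrayList<>();
        for (int i = 0; i < targetKeys.size(); i++) {
            conjuncts.add(quoted(target) + "." + quoted(targetKeys.get(i))
                    + " = " + quoted(alias) + "." + quoted(tempKeys.get(i)));
        }
        return String.join(" AND ", conjuncts);
    }

    // DELETE FROM target WHERE EXISTS (SELECT 1 FROM (temp_table_data) temp WHERE <conjuncts>)
    static String deleteViaTemporaryTable(String target, String tempTableData, List<String> targetKeys, List<String> tempKeys)
    {
        return "DELETE FROM " + quoted(target)
                + " WHERE EXISTS (SELECT 1 FROM (" + tempTableData + ") " + quoted("temp")
                + " WHERE " + conjunctsBetweenTargetAndTemp(target, "temp", targetKeys, tempKeys) + ")";
    }

    // UPDATE target SET <non-key columns> WHERE <row id conjuncts>
    static String updateStatement(String target, List<String> nonKeyColumns, List<String> primaryKeys)
    {
        String assignments = nonKeyColumns.stream()
                .map(column -> quoted(column) + " = ?")
                .collect(Collectors.joining(", "));
        return "UPDATE " + quoted(target) + " SET " + assignments + " WHERE " + rowIdConjuncts(primaryKeys);
    }

    public static void main(String[] args)
    {
        // PostgreSQL-like case: the hidden key column is renamed in the temporary table.
        System.out.println(deleteViaTemporaryTable(
                "orders", "SELECT * FROM temp_table", List.of("ctid"), List.of("ctid_for_delete_merge")));
        // Connector without a temporary table: delete and update use the primary keys directly.
        System.out.println("DELETE FROM " + quoted("orders") + " WHERE " + rowIdConjuncts(List.of("id")));
        System.out.println(updateStatement("orders", List.of("status", "total"), List.of("id")));
    }
}
```

Keeping the target key names and the temporary-table column names as two separate lists is one way to model the rename case: a connector whose keys need no renaming would simply pass the same list twice.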
Release notes
( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: